Change opening datasets to use Intake plugins #121
…stgdal, intake_questhdf5
* Added use of enums where relevant * Re-ordered import statements
* make search catalog more flexible * Update test/test_catalog.py * disable HydroShare tests
| from .metadata import get_metadata | ||
| from .tasks import add_async | ||
| from ..util import to_geojson | ||
| from ..static import UriType, PluginType |
F401 '..static.UriType' imported but unused
| from ..database import get_db, db_session | ||
| from ..static import DatasetStatus | ||
| from ..util import logger as log | ||
| from ..static import DatasetStatus, UriType |
F401 '..static.UriType' imported but unused
|
|
||
| from ...static import DatasetStatus, DatasetSource | ||
| from ...util import listify, format_json_options, uuid | ||
| from ...static import DatasetStatus, DatasetSource, UriType |
F401 '...static.UriType' imported but unused
| from io import StringIO | ||
| from geojson import Feature, FeatureCollection | ||
|
|
||
| from ..static import UriType |
F401 '..static.UriType' imported but unused
| import numpy as np | ||
|
|
||
| from quest.plugins import IoBase | ||
| from quest.static import DataType |
F401 'quest.static.DataType' imported but unused
|
|
||
| from quest import util | ||
| from quest.plugins import ToolBase | ||
| from quest.static import DataType, UriType |
F401 'quest.static.UriType' imported but unused
| import param | ||
| from quest import util | ||
| from quest.plugins import ToolBase | ||
| from quest.static import DataType, UriType |
F401 'quest.static.UriType' imported but unused
| from quest.plugins import ToolBase | ||
| from quest.api import get_metadata, update_metadata | ||
| from quest.plugins import load_plugins | ||
| from quest.static import UriType, DataType |
F401 'quest.static.UriType' imported but unused
|
|
||
| from quest.plugins import ToolBase | ||
| from quest.api import get_metadata, update_metadata | ||
| from quest.static import DataType, UriType |
F401 'quest.static.UriType' imported but unused
| from quest.plugins import ToolBase | ||
| from quest import util | ||
| from quest.plugins import ToolBase | ||
| from quest.static import DataType, UriType, GeomType |
F401 'quest.static.UriType' imported but unused
Can I help at all in this process?
@martindurant thanks for offering to help with this. I am working to release a few bug fixes before integrating this PR, so I have not yet had time to review it. I would be interested in you reviewing what @douggallup has done with using Intake in Quest and offering any suggestions you may have.
martindurant
left a comment
A few thoughts from my point of view.
Note that I am probably missing a lot of the context from the Quest side, since I am not familiar with its internals.
| m = get_metadata(dataset).get(dataset) | ||
| file_format = m.get('file_format') | ||
| path = m.get('file_path') | ||
| intake_plugin = m.get('intake_plugin') |
Note that the class that loads data for intake is now usually called a "driver" - there are many types of plugins. https://intake.readthedocs.io/en/latest/glossary.html
| # Use intake plugin to open | ||
| if intake_plugin: | ||
| # New code, with 'intake_plugin' added to the local .db | ||
| plugin_name = 'open_' + intake_plugin |
This seems somewhat fragile.
You could look directly in the registry:
import intake
cls = intake.registry[intake_plugin]
source = cls(*args, **kwargs)
or, perhaps better, you could construct either the relevant YAML block or an intake.catalog.local.LocalCatalogEntry instance, and have Intake do the lookup for you.
Note that in the Intake world, the driver here could be something like "parquet", but it can also be the fully-qualified class name like "intake_parquet.ParquetSource". Of course, if you have additional constraints within Quest, that's fine.
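To make the lookup suggestion concrete, here is a minimal sketch of resolving a driver either by short name or by fully-qualified class name. The helper `resolve_driver`, the `FakeSource` class, and `demo_registry` are hypothetical names invented for this illustration. The registry is passed in as a plain mapping so the example runs without Intake installed; in Quest it would presumably be `intake.registry`, which is an assumption and not something this snippet exercises.

```python
import importlib


def resolve_driver(name, registry):
    """Return the driver class registered under `name`.

    `registry` is any mapping of short driver names to classes (assumed to be
    `intake.registry` in real use). If `name` is not a registered short name,
    it is treated as a fully-qualified "package.module.ClassName" string.
    """
    if name in registry:
        return registry[name]
    module_name, _, class_name = name.rpartition(".")
    if not module_name:
        # Not registered and not a dotted path: fail with a readable message.
        raise KeyError(
            f"unknown driver {name!r}; known drivers: {sorted(registry)}"
        )
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


class FakeSource:
    """Stand-in for a data source class, for illustration only."""

    def __init__(self, urlpath, **kwargs):
        self.urlpath = urlpath
        self.kwargs = kwargs


# Hypothetical registry with a single entry.
demo_registry = {"fake": FakeSource}

cls = resolve_driver("fake", demo_registry)
source = cls(urlpath="data.tif", chunks={})
```

Compared with building an attribute name by concatenating `'open_'` and the driver name, this fails early with an explicit error when the driver is unknown, and it accepts both naming styles mentioned above.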
| file_path = orm.Optional(str, nullable=True) | ||
| visualization_path = orm.Optional(str) | ||
| intake_plugin = orm.Optional(str) | ||
| intake_args = orm.Optional(str) |
A JSON representation of arguments, correct? If these are stored as strings, would it make sense to use the same YAML spec used by Intake text-file catalogs?
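As a rough sketch of what that could look like, the snippet below maps the two stored fields onto the `sources:` / `driver:` / `args:` layout used by Intake text-file catalogs. It assumes `intake_args` holds a JSON object; the helper name `to_catalog_entry`, the dataset name `my_raster`, and the example argument values are hypothetical. The resulting dict could then be serialized with a YAML library, which is not shown here.

```python
import json


def to_catalog_entry(name, intake_plugin, intake_args):
    """Build a dict shaped like one entry of an Intake YAML catalog.

    `intake_plugin` and `intake_args` mirror the two new database fields;
    `intake_args` is assumed to be a JSON-encoded object (or empty).
    """
    args = json.loads(intake_args) if intake_args else {}
    if not isinstance(args, dict):
        raise ValueError("intake_args must decode to a JSON object")
    return {"sources": {name: {"driver": intake_plugin, "args": args}}}


entry = to_catalog_entry(
    "my_raster",  # hypothetical dataset name
    "rasterio",
    '{"urlpath": "/tmp/example.tif", "chunks": {}}',  # illustrative arguments
)
```

Storing the arguments in this shape would let the same record be used directly as a catalog entry, so Intake performs the driver lookup itself.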
| 'elevation': 'elevation' | ||
| } | ||
|
|
||
| def download(self, catalog_id, file_path, dataset, **kwargs): |
I do not know the context here, but you should be aware that Intake can also download source data files on first use: https://intake.readthedocs.io/en/latest/catalog.html#caching-source-files-locally
Changes the method of opening datasets from the quest_io_plugins to the intake library (https://github.com/ContinuumIO/intake). Requires the intake-xarray plugin (https://github.com/ContinuumIO/intake-xarray) for reading rasters, and the intake_questhdf5 plugin (https://github.com/Aquaveo/intake_questhdf5) for reading hdf5 files (both XY and timeseries).
The local metadata db is changed so that each dataset has 'intake_plugin' and 'intake_args' fields. 'intake_plugin' names the appropriate plugin (e.g. 'rasterio' from intake-xarray, 'quest_xyHdf5' or 'quest_timeseries_hdf5' from intake-questhdf5), and 'intake_args' holds the arguments needed to use that plugin (e.g. path, chunks, etc). The intake registry lists the installed plugins.
This only replaces opening a dataset; it does not affect the visualization or output capabilities of the quest io plugins.